skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Mou, Wenlong"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. We study stochastic approximation procedures for approximately solving a $$d$$-dimensional linear fixed point equation based on observing a trajectory of length $$n$$ from an ergodic Markov chain. We first exhibit a non-asymptotic bound of the order $$t_{\mathrm{mix}} \tfrac{d}{n}$$ on the squared error of the last iterate of a standard scheme, where $$t_{\mathrm{mix}}$$ is a mixing time. We then prove a non-asymptotic instance-dependent bound on a suitably averaged sequence of iterates, with a leading term that matches the local asymptotic minimax limit, including sharp dependence on the parameters $$(d, t_{\mathrm{mix}})$$ in the higher order terms. We complement these upper bounds with a non-asymptotic minimax lower bound that establishes the instance-optimality of the averaged SA estimator. We derive corollaries of these results for policy evaluation with Markov noise—covering the TD($$\lambda$$) family of algorithms for all $$\lambda \in [0, 1)$$—and linear autoregressive models. Our instance-dependent characterizations open the door to the design of fine-grained model selection procedures for hyperparameter tuning (e.g., choosing the value of $$\lambda$$ when running the TD($$\lambda$$) algorithm). 
    more » « less
  2. null (Ed.)
  3. null (Ed.)
  4. null (Ed.)